Variable Selection for K-Means Quantization

Author

  • C. LEVRARD
Abstract

Recent results in quantization theory provide theoretical bounds on the distortion of squared-norm based quantizers (see, e.g., [3] or [10]). These bounds are valid whenever the source distribution has bounded support, regardless of the dimension of the underlying Hilbert space. However, it remains of interest to select the relevant variables for quantization. This task is usually performed using coordinate energy-ratio thresholding (see, e.g., [1] or [17]), or by maximizing a constrained empirical Between Cluster Sum of Squares criterion (see, e.g., [4] or [22]). This paper offers a Lasso-type procedure to select the relevant variables for k-means clustering, as presented in [18]. Moreover, some non-asymptotic convergence results on the distortion are derived for this procedure, along with consistency results toward sparse codebooks.
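The abstract does not spell out the penalized criterion, so the following is only a minimal sketch under a stated assumption: that the Lasso-type procedure adds a group penalty, one group per coordinate, to the empirical distortion of a codebook. The function names, the parameter `lam`, and the unweighted form of the penalty are all hypothetical choices for illustration, not the paper's definitions.

```python
import numpy as np

def empirical_distortion(X, C):
    """Mean squared distance from each sample (row of X) to its nearest codepoint (row of C)."""
    sq_dists = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)  # shape (n, k)
    return sq_dists.min(axis=1).mean()

def group_penalty(C):
    """Assumed penalty: sum over coordinates of the Euclidean norm of that coordinate across codepoints."""
    return np.linalg.norm(C, axis=0).sum()

def penalized_distortion(X, C, lam):
    """Assumed Lasso-type criterion: empirical distortion plus lam times the group penalty."""
    return empirical_distortion(X, C) + lam * group_penalty(C)

def selected_variables(C, tol=1e-12):
    """Indices of coordinates that are nonzero in at least one codepoint."""
    return np.flatnonzero(np.linalg.norm(C, axis=0) > tol)
```

Under this assumed form, a coordinate that is zero in every codepoint adds nothing to the penalty, so a sparse codebook is one in which `selected_variables` returns only a few indices.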


Similar resources

Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition

 In this paper, utilization of clustering algorithms for data fusion in decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...


Sparse oracle inequalities for variable selection via regularized quantization

We give oracle inequalities on procedures that combine quantization and variable selection via a weighted Lasso k-means type algorithm. The results are derived for a general family of weights, which can be tuned to size the influence of the variables in different ways. Moreover, these theoretical guarantees are proved to adapt to the corresponding sparsity of the optimal codebooks, suggesting th...


A hybrid DEA-based K-means and invasive weed optimization for facility location problem

In this paper, instead of the classical approach to the multi-criteria location selection problem, a new approach was presented based on selecting a portfolio of locations. First, the indices affecting the selection of maintenance stations were collected. The K-means model was used for clustering the maintenance stations. The optimal number of clusters was calculated through the Silhou...


Comments on "Modified K-means algorithm for vector quantizer design"

Recently a modified K-means algorithm for vector quantization design has been proposed where the codevector updating step is as follows: new codevector = current codevector + scale factor × (new centroid − current codevector). This algorithm uses a fixed value for the scale factor. In this paper, we propose the use of a variable scale factor which is a function of the iteration number. For the vecto...


On two extensions of the vector quantization scheme

Abstract: In this paper, we present results pertaining to two different extensions of vector quantization and the related question of k−means clustering. The first part of the paper is about the theoretical performance of quantization and clustering with Bregman divergences. The second one is dedicated to model selection issues for principal curves. Some numerical illustrations are provided in ...




Publication date: 2016